A direct method for the Boltzmann equation 
based on a pseudo-spectral velocity space 

discretization 



G. P. Ghiroldi a , L. Gibelli b '* 

& Politecnico di Milano, Dipartimento di Matematica, Piazza Leonardo da Vinci 

32, 20133 Milano, Italy 

Politecnico di Milano, Dipartimento di Scienze e Tecnologie Aerospaziali, Via La 

Masa 34, 20156 Milano, Italy 



Abstract 

A deterministic method is proposed for solving the Boltzmann equation. The method 
employs a Galerkin discretization of the velocity space and adopts, as trial and test 
functions, the collocation basis functions based on weights and roots of a Gauss- 
Hermite quadrature. This is defined by means of half- and/or full-range Hermite 
polynomials depending whether or not the distribution function presents a discon- 
tinuity in the velocity space. The resulting semi-discrete Boltzmann equation is in 
the form of a system of hyperbolic partial differential equations whose solution can 
be obtained by standard numerical approaches. The spectral rate of convergence of 
the results in the velocity space is shown by solving the spatially uniform homoge- 
neous relaxation to equilibrium of Maxwell molecules. As an application, the two- 
dimensional cavity flow of a gas composed by hard-sphere molecules is studied for 
different Knudsen and Mach numbers. Although computationally demanding, the 
proposed method turns out to be an effective tool for studying low-speed slightly 
rarefied gas flows. 
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1 Introduction 



The conventional continuum approach to gas dynamics, namely the Navier- 
Stokes equations with no-slip boundary conditions, is justified when the a- 
verage distance traveled by molecules between two successive collisions, A, is 
much smaller than a characteristic length, L, associated to the flow geometry. 
This condition breaks down in several physical situations ranging from the re- 
entry of spacecraft in upper planetary atmospheres, characterized by large A, 
to fluid-structure interaction in small-scale micro-electro-mechanical systems, 
characterized by small L. In such situations, a microscopic description of the 
gas based on the Boltzmann equation is required pQ . In the absence of external 
forces, the Boltzmann equation for a gas composed by a single monatomic 
species whose atoms have mass m, takes the form 

d -l + v .v x f = cuj) (i) 

where 

C(f,f)= I I [Pn-ff^o^Vrlk-vMvAdv^k (2) 

J JS 2 

In Eqs. ( fTpj ), f(x,v,t) denotes the distribution function of atomic velocities 
v at spatial location x and time t, whereas C(f, f) gives the collisional rate 
of change of / at the phase space point (x, v) at time t and we have used the 
shorthand /* = f(x,v*,t), f* = f(x,v*,t) and fi = f(x,vi,t). As is clear 
from Eq. (j2l), C(f,f) is a non-linear functional of /, whose precise structure 
depends on the assumed atomic interaction forces through the differential 
cross section <x(||i; r ||,fc • v r ). The dynamics of binary encounters determines 
the differential cross section as a function of the modulus ||« r || of the relative 
velocity v r = V\ — v of two colliding atoms and of the orientation of the unit 
impact vector k with respect to v r . The collisional dynamics also determines 
the pre-collisional velocities v* and v\ which are changed into v and v% by a 
binary collision 



v* = v + (v r ■ k)k 
v\ = Vi — (v r ■ k)k 



(3) 
(4) 



The prevalent approach for numerically solving Eq. (LL]) is a stochastic-based 
method called Direct Simulation Monte Carlo (DMSC) [2]. The distribution 
function is represented by a number of mathematical particles which move in 
the computational domain and collide according to stochastic rules derived 
from Eqs. Q-Q. Macroscopic flow properties are obtained by time averag- 
ing particle properties. If the averaging time is long enough, then accurate 
flow simulations can be obtained by a relatively small number of particles. 
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Variants of DSMC have been proposed over the years to improve solution ac- 
curacy in the presence of high density gradients [3] and for low Mach number 
flows [3]. Although particle-based methods are by far the most effective tools 
in describing non-equilibrium gas flows, they are not well suited to simulate 
unsteady gas flows. Indeed, in this case the possibility of time averaging is 
lost or reduced. Acceptable accuracy can only be achieved by increasing the 
number of simulation particles or superposing several flow snapshots obtained 
from statistically independent simulations of the same flow but, in both cases, 
the computing effort is considerably increased. The simulation of steady gas 
flows in the near continuum limit represents an additional challenge since the 
time scale on which the particle-based methods are forced to operate is much 
shorter than the characteristic macroscopic time and therefore explicit inte- 
gration to steady state is computationally demanding. Approaches based on a 
direct discretization of the Boltzmann equation in the phase space are believed 
to be a feasible alternative in these cases since they provide solutions with 
high accuracy even in unsteady conditions and offer the possibility of a direct 
steady-state formulation [5|6] . As such, they have been applied to study sev- 
eral problems of both theoretical interest and practical importance, including 
the viscous gas damping in microfluidic devices [7f5] . the onset of instability 
in a rarefied gas environment [9] and the investigation of ghost effects [TO] . 
Deterministic methods of solution present some further assets compared to 
particle-based methods. Firstly, they are more suited to be adopted within 
a domain decomposition approach since the need to exchange information 
between kinetic and macroscopic equations requires smooth numerical solu- 
tions [T2lfT3] . Secondly, unlike particle-based methods their implementation on 
massively parallel computers with SIMD architecture, such as multi-core and 
GPU processors, can easily realize the full potential of these supercomput- 
ers [T^lll5fl6j . These aspects also prompted the development of deterministic 
methods of solution. 

Considerable progress has been accomplished in developing deterministic me- 
thod for kinetic model equations [T7p8] . By contrast, an accurate and efficient 
direct solution of the Boltzmann equation itself remains a challenging prob- 
lem. A common strategy adopted for solving Eq. ([I]) consists in decoupling the 
transport and the collision terms by time-splitting the evolution operator into 
a transport step and a collisional step. The transport step requires to solve a 
system of hyperbolic conservation laws coupled at the boundaries. Their dis- 
cretization can be performed in a variety of ways, including finite-difference, 
finite- volume, finite-element or spectral methods [19] . The collision step con- 
sists of solving a spatially homogeneous relaxation equation. This is the more 
computationally demanding part since it involves the computation of the bi- 
linear five- fold integral defining the collision operator, Eq. (j2|. The numerical 
approaches to evaluate the collision step may be grouped into two broad cat- 
egories. To the first category belong methods referred to as discrete velocity 
models (DVM). They make use of a Cartesian grid in velocity space and con- 
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struct a discrete collision mechanics on the points of the grid that preserve the 
main physical properties of the collision integral, namely equilibrium states, 
collision invariants and entropy inequality |20j. DVM methods have high com- 
putational cost and low order of accuracy although fast algorithms have been 
recently developed for a restricted set of collision kernels |21j . To the second 
category belong methods which adopt a Galerkin discretization of the velocity 
space. They are based on expanding the velocity dependence of the distribu- 
tion function in a set of trial functions with expansion coefficients that depend 
on position and time. The Galerkin ansatz is substituted in the space homoge- 
neous relaxation equation which is subsequently multiplied by test functions 
and integrated in the velocity space. According to the Galerkin approach, 
test and trial functions are assumed to be the same. The above procedure 
yields a system of hyperbolic partial differential equations for the expansion 
coefficients. Galerkin methods can be further distinguished depending on the 
basis functions which they employ. In Fourier-Galerkin approach, the distri- 
bution function is expanded in trigonometric polynomials and the fast Fourier 
Transform is used to accelerate the computation of the collision integral in the 
velocity space. Several different methods have been developed starting from 
different representation of the collision integral [2"2"|23|24|l2"o'] . These methods 
are generally very efficient and spectrally accurate for smooth solutions. Their 
major shortcoming is the loss of some of the properties of the solution such 
as positivity and conservation of momentum and energy. Preservation of col- 
lision invariants can be enforced but the use of corrections procedure may 
limit the accuracy of the solutions. Discontinuous Galerkin methods adopt 
discontinuous piecewise polynomials as test and trial functions [26ll27l)28|l29] . 
Although computationally demanding, these methods have the remarkable 
feature to provide spectral accuracy in the velocity space even for discon- 
tinuous solutions which typically occur in the presence of solid surfaces. An 
hybrid approach is adopted in Refs. [3T>1l3"l~] where the distribution function is 
expanded in Laguerre polynomials with respect to the velocity components 
parallel to solid surfaces whereas quadratic finite element functions have been 
used for the normal velocity component. This approach permits to explicitly 
account for the discontinuity of the distribution function and its application 
to one-dimensional problems has been shown to provide accurate numerical 
solutions in both linearized and non-linear regime. 

The main objective of the present work is to propose a new approach to the 
deterministic solution of the Boltzmann equation. The method uses a Galerkin 
discretization of the velocity space and adopts, as trial and test functions, the 
collocation basis functions based on weights and roots of a Gauss-Hermite 
quadrature. The Boltzmann equation is thus simplified to a system of hyper- 
bolic conservation laws with non-linear source terms in the physical space. A 
fully discrete numerical scheme can then be derived by using standard ap- 
proaches [12]. The method proposed in Ref. [32] shows strong similarities with 
the present work but it differs in two significant respects. In Ref. [32], the 
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method is developed for a system of linear Boltzmann equations and the collo- 
cation basis functions have been defined only on the basis of full-range Hermite 
polynomials. By contrast, we are here concerned with the non- linear Boltz- 
mann equation and the roots and weights of the half-range Gauss-Hermite 
quadrature formula also enter in the definition of the collocation basis func- 
tions. Half-range Hermite polynomials have been widely used to study flow 
and heat transfer problems in rarefied gas dynamics [33,34,35j36] and the 
Milne problem in radiative transfer [37]. In comparison with full-range Her- 
mite polynomials, they allow to deal exactly with boundary conditions and to 
achieve a faster convergence rate in the presence of discontinuous distribution 
functions. 

The rest of the paper is organized as follows. Section [2] illustrates the method 
of solution in detail. Section [3] covers two aspects. Firstly, the spectral ac- 
curacy of the proposed velocity space discretization is illustrated by a com- 
parison with the exact solution of the spatially homogeneous relaxation for 
Maxwellian molecules. Secondly, the possibilities of the proposed method are 
demonstrated by solving the two-dimensional driven cavity flow of a gas com- 
posed by hard-sphere molecules for different Knudsen and Mach numbers. 
Section |4] summarizes the main results and the future research directions. 



2 Mathematical formulation 



The numerical scheme to deterministically solve the Boltzmann equation is 
derived through a two-step procedure. As a first step, Eq. ([!]) is rewritten in 
terms of the deviational part of the distribution function, h(x,v,t), defined 
as 

f(x,v,t) = M(v)[l + h(x,v,t)] (5) 

where M(v) is the Maxwellian at equilibrium with uniform and constant den- 
sity no and temperature T , i.e., 

M = H ° „.„ exp ( —| (6) 

(2nRT ) 3/2 P \ 2RT J y } 

As discussed in Refs. [Toll26|28j . the use of h permits to greatly improve the 
accuracy of the method of solution when the gas approaches equilibrium since 
it reduces the truncation errors which arise in evaluating Eq. ^ for distribu- 
tion functions close to Maxwellian. In this respect, an equilibrium distribution 
function which depends on space and time, M(x, v, t), may be also adopted 
in Eq. ^ to further increase accuracy of the computations. The applicability 
of the method, however, is not affected by this choice. Substituting Eq. (J5]) 



5 



into Eq. Q yields 



— + v-V x h = [ [ M 1 [h* + h*-h-h 1 + (h*h*-hh 1 )} 
at Jr3 J S 2 

a(\\v r \\, k ■ v r )\\v r \\dvid 2 k (7) 

It is worth noticing that if the perturbation is sufficiently small, i.e., h —> 0, the 
quadratic terms in h become negligible and Eq. ^ simplifies to the linearized 
Boltzmann equation. The present formulation in terms of the deviational part 
of the distribution function, however, is not restricted to a vanishing pertur- 
bation and it is valid in the non-linear well. 

As a second step, we seek an approximate solution of Eq. Q by using a 
Galerkin approach. A suitable finite dimensional linear space of orthonormal 
trial functions <pi(v) is chosen and the deviational part of the distribution 
function is expanded in a series of the form 



N-1 

h(x,v,t) ~ h N (x,v,t) = ^ hj(x,t)(f)j(v) (8) 

j=0 

Inserting Eq. ^ into Eq. ^ implies a residual depending on the expansion 
coefficients, hj(x,t). These are determined by requiring that the residual is 
orthonormal with respect to some test functions. According to the Galerkin 
method, trial and test functions are assumed to be equal. Therefore, we sub- 
stitute the spectral Galerkin ansatz, Eq. (]§]), into Eq. ([I]), multiply both side 
by 4>i(v) and integrate with respect to the velocity variable v. We thus obtain 
a closed first-order hyperbolic system of conservation laws 

dhi 3 7V ~ 1 dhj N ~ x N ~ x 

-or + H H Ci i a ^ = ^ Lijhj + ^ Qijk h jh (9) 

UL a=l j=0 UX » j=0 j,k=0 

In Eq. ([9]), the coefficient matrices CV, a , and Q^k are given, respectively, 
by 



C ija = M(v)<pi(v)<f>j(v)v a dv (10) 

L i7 = / / M{v)M(v x )1C{<l) i: v : vx) \\v r \\<f>Av)dvdvx (11) 

JR 3 JM.3 

Qijk = ^ / / Af(t>)M(«i)/C(0i,«,«i) \\v r \\(j)j{v)4> k {vx)dvdvi (12) 



The time-independent kernel of the integral operators in Eqs. (11) and (12), 
K (0i,t>,t>i), read, 

£(0 4 ,«,«i)= / [&(«*) + <Pi{v\) - <pi(v) - 0»(«i)] er(||« r ||,fc ■ v r )d 2 k (13) 
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Boundary conditions can be treated similarly. For the sake of simplicity, let 
us consider a fixed plane wall at x 2 = with temperature T . Let us further 
suppose that the gas fills the half space X2 > and molecules which strike 
the wall are re-emitted according to the Maxwell's scattering kernel with com- 
plete accommodation [1] . In terms of the deviational distribution function, the 
boundary condition reads 



h(0,x 2 ,x 3 ,v) = — -W— J- / Mv 2 h(0,x 2 ,x 3 ,v)dv, v 2 > (14) 
n V Rio Jv2<o 



Substituting Eq. d8J) into Eq. (14), multiplying both side by 4>i(v) and inte- 



grating with respect to the velocity variable v, yields 



N-1 

hi=J2 B l3 h 3 (15) 

j=0 



where 



B ij = - — \lwF I M(v)Mv)dv f M(v)i> j (v)v 2 dv (16) 
n v Rio JK 3 Jv 2 <o 



Several methods of solution for the Boltzmann equation employ a Galerkin 
discretization of the velocity space but very different numerical schemes are de- 
rived due to the different trial and test functions [22ll28llMll25ll26T27ll2^ 
In the present work, we adopt the collocation basis functions based on weights, 
Wi, and roots, Zj, of the Gaussian quadrature formula defined by means of half- 
and/or full-range Hermite polynomials, i.e., <pi(v) = Bi(v) with Bi as given by 



Eq. (28) (see Appendix A for further details). A first consequence of this choice 



is that, by virtue of Eq. (34), expansion coefficients are proportional to the 



values of the deviational part of the distribution function at the quadrature 
roots Zi, that is hi(x,t) = y/Wih^lx, Zi,t). The proposed method of solution 
is thus closely related to discrete velocity methods. Furthermore, computing 
the integrals over the velocity space in Eqs. (10)-(12) as finite summations 
according to the Gauss-Hermite formula for numerical quadrature, Eq 
yields 



(26), 



n.. — x. .J a ^ 

N-1 

E 

m=0 



Lij = y/w] W rn \\Zj - Z m \\K, (0j, Zj, Z r 



Qijk = ^yJWjWk \\Zj - Z k \\ JC {4> h Zj,Z k ) 



(17) 

(18) 
(19) 



Likewise, integrals in Eqs. ( 16 ) can be rewritten as 
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B 




J2 V™k 6 j,k4 



(2) 



(20) 



,(2) 



<0 



Matrices given by Eqs. (18), (19) and (20), can be computed with good ac- 
curacy once and for all and used in many individual simulations as long as 
their velocity space discretization is the same. It is worth noticing that Gauss- 
Hermite quadrature formula can integrate exactly polynomials of degree at 
most 2N a — 1 in each velocity component v a . Since integrals in Eqs. (11) 
and (12) may involve polynomials of higher order depending on the differen- 
tial cross section, their numerical evaluation is approximated. The proposed 
method for the numerical solution of the Boltzmann equation has therefore a 
pseudo-spectral nature. 

The collocation basis functions introduced above hold several advantages com- 
pared with alternative choices of functions (pi. Firstly, as shown by Eq. (17), 
they lead to a diagonal system of conservation laws. This simplifies the problem 
of solving the advection step since boundary conditions are explicitly identi- 
fied. Secondly, using half-range Hermite polynomials permits to exactly com- 
pute the half-space integrals appearing in the boundary conditions, Eq. (16). 
This permits to greatly improve accuracy of the solution for boundary value 
problems J33.35J. It is worth pointing out that the use of half-range Hermite 
polynomials is not restricted to a cartesian geometry since, by making use 
of a body fitted coordinate system, arbitrary geometries can be dealt with as 
well 



|. Finally, as it will be shown in Subsec. 3.1 , collision invariants are pre- 
served during the relaxation step within machine precision and no correction 
method to enforce conservation of momentum and energy is thus required. 



3 Results and discussion 



The key feature of the method described in Section [2] is the introduction of a 
grid in the velocity space whose points are the roots of the Gaussian quadra- 
ture formula based on half- and/or full-range Hermite polynomials. This dis- 
cretization of the velocity space has major impact on the evaluation of the 
collision integral. In Subsec. 3^ the proposed method is thus assessed by com- 
puting spatially homogeneous relaxation for Maxwell molecules. Most of the 
spectral approaches for solving the Boltzmann equation have been developed 
for a restricted set of collision kernels which do not include Maxwell molecules 
in the three-dimensional velocity space. By contrast, the method proposed 
here can be carried over to arbitrary particle interactions and therefore we 
can consider the fully three-dimensional relaxation process. The comparison 
with an exact solution will show that mass, momentum and energy are con- 
served and numerical results are spectrally accurate in the velocity space. 
In Subsec. |3.2| the possibilities of the proposed method are demonstrated by 
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v/(RT Q ) 



Fig. 1. Comparison between the BKW distribution function, Eq. (21), evaluated at 
t = i (solid line) and numerical distribution function, fw, for different number of 
discrete velocities along each direction of the velocity space, N (dashed lines). 

solving the driven cavity flow of a gas composed by hard-sphere molecules. 
This is a classical benchmark problem which it is often used for the validation 
of numerical codes. In spite of its simple geometry, in fact, it contains most 
of the features of more complicated problems described by kinetic equations. 
A comprehensive study of this problem has been carried out in Refs. [44] 
for the Bhatnagar-Gross-Krook-Welander kinetic model equation in the lin- 
earized regime. By contrast, only few results are available for the hard-sphere 
Boltzmann equation [15.45J. Subsec. 3.2 is thus of interest in that it provides 



accurate reference solutions albeit for a restricted range of Knudsen and Mach 
numbers. 



3.1 Spatially homogeneous relaxation 



The method described in Sec. [2] is here assessed by comparing the numerical 
solution of the three-dimensional spatially homogeneous relaxation of Maxwell 
molecules with the exact Bobylev-Krook-Wu (BKW) distribution [4Ull41f42] . 
The BKW distribution is radially symmetric and reads 



f(v,t) 



n 



[2na(t)RT c 



,3/2 



exp 



2RT Q a{t) J 



5a(t) - 3 1 



a(t) 



2a(t) 2a(t) 2 RT 



(21) 
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Fig. 2. Error between BKW and numerical distribution function, Eq. (23), versus 
the total number of discrete velocities, N, for two different initial instants of time, 
to = i and to = 2t. 

where n and T are the equilibrium density and temperature, respectively, 
and 

a(t) = 1 



exp 



--nn (3t 



(22) 



being (3 the constant which enters in the definition of the Maxwell's molecules 
differential cross section, i.e., cr(||iv||, k ■ v r ) = /3/\\v r \\. The distribution func- 
tion given by Eq. (21 ) has physical validity only for t >i — (47rno/5) _1 6 log (5/2) 



since it is negative for smaller times. The three-dimensional velocity space is 
discretized using N = N x N x N discrete velocities. Since the BKW dis- 
tribution function is continuous in the velocity space, the discrete velocities 
are the quadrature nodes only based on the full-range Hermite polynomial 
of degree N. A simple first order explicit Euler method is employed to solve 
Eq. ([9]) in which transport terms are neglected, Cy Q = 0, and the coefficient 
matrices, Eqs. (18) and (19), are computed using the differential cross sec- 



tion of Maxwell's molecules. The initial condition is determined by expanding 
Eq. (21) at t — to — i according to Eq. (18]). The ability of the polynomial 



expansion to reproduce the initial condition is shown in Fig. [T] where a com- 
parison is made between the BKW distribution evaluated at to = i (solid line) 
and the distribution function as given by Eq. ^ (dashed lines) for different 
number of expansion polynomials used in each direction of the velocity space 
N. Because of the symmetry of the BKW distribution, odd full-range Hermite 
polynomials do not effectively contribute to the expansion defined in Eq. ^ 
and thus only results for the even values of N have been reported. A relatively 
high number of polynomials is required to obtain an accurate representation 
of the initial condition. This is reasonable since the BKW solution at to = i is 
an highly non-equilibrium distribution function. However, we notice that even 
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Fig. 3. Error between BKW and numerical distribution function, Eq. (23), versus 



dimensionless time for different number of discrete velocities along each direction of 
the velocity space, N. 



if a low number of expansion polynomials is used, i.e., N 



8, the distribution 

functions given by Eqs. (|8|) and (|21|) provide the same mass, momentum and 



energy. As a matter of fact, by construction, the first N moments of h and 
are the same. In order to quantitatively determine the convergence rate of the 
polynomial expansion, we report in Fig. [2] the error between BGW and the 
numerical distribution function for two different initial instants of time, to = i 
and to = 2t, versus the total number of discrete velocities, N. The error has 
been computed according to the formula 



e(t, to) = ^ ^ 3 [/(«, t) - f N (v, t)f dv (23) 
where / is the BKW distribution function, Eq. (21), and /jy is the numerical 



solution. As Figure |2| clearly shows, for a given N, the error at to — 2t is 
lower than the error at to = t, which is not unexpected. Indeed, the higher 
is to the closer is the distribution function to a Maxwellian and therefore 
the smaller is the number of expansion polynomials required to represent the 
distribution function with a given accuracy. Furthermore, independently of the 
initial instant of time, as iV increases, the error decreases with a rate which is 
higher than any algebraic order. The proposed velocity space discretization is 
therefore spectrally accurate. In Fig. ^ the error is reported as a function of 
time for N = 3,5 and 7 with to — t. The most important feature to highlight is 
that the error asymptotically tends to zero up to machine precision whatever is 
the value of N. This behavior indicates that distribution function approaches 
the Maxwellian distribution while conserving mass, momentum and energy. 
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Table 1 

Volume flow rate of the main vortex, G, drag coefficient, D, mean temperature, 0, 
and xi-component of mean heat flux, <3>i, along the moving wall, for different values 
of Knudsen, Kn, and Mach, Ma, numbers. 

3. 2 Two dimensional driven cavity flow 



The method described in Sec. [2] is here applied to solve the two-dimensional 
driven cavity flow of a gas composed by hard-sphere molecules. The prob- 
lem geometry consists of a square enclosure with length L. All the walls are 
kept at uniform and constant temperature To. Initially, the gas is in uniform 
equilibrium with density n and temperature T . The flow is driven by the 
translation of the lid of the cavity with constant velocity V w [44]. 
The cavity flow problem has been solved for Mach numbers in the range 
[0,0.4] and for two values of the Knudsen number, Kn = 0.02 and 0.1. The 
Mach and Knudsen numbers are defined, respectively, as Ma = V w /a with 
a = (5/3-RTq) 1 / 2 the speed of sound, and Kn = \o/L with A = Ho/po(2RT ) 1 ^ 2 
and Ho the viscosity of the hard sphere gas pQ. Numerical results have been 
obtained by solving Eq. ^ where the coefficient matrices, Eqs. (17)-(19), are 
computed with the hard-sphere differential cross section, i.e., cr(||v r ||, k-v r ) = 
o? 2 /4, being d the molecules' diameter. The solution in one time step has been 
obtained by first integrating the transport equation, Eq. (|9]) with Ly = = 
0, and afterwards the space homogeneous equation, Eq. Q with Cy Q = 0, us- 
ing the output of the previous step as initial condition. Iterating the process 
provides the numerical solution at later times. The three-dimensional velocity 
space is discretized using N = 8 x 8 x 4 discrete velocities. Because of the 
presence of solid walls, the discrete velocities along the X\ and Xi directions 
are the quadrature nodes of the positive and negative 4t/i-order half-range 
Hermite polynomials whereas discrete velocities along the £3 direction are the 
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the quadrature nodes of the Ath-order full-range Hermite polynomial. The 
two-dimensional physical space has been divided into M = 200 x 200 rect- 
angular cells. Their lengths have been uniformly stretched with a progression 
ratio R = 0.04 for Kn = 0.02 and R = 0.4 for Kn = 0.1. The smaller cells 
are located close to the upper corners of the cavity where severe gradients 
are anticipated. The collision and transport steps have been solved by using 
simple first-order explicit Euler and upwind schemes. 

We firstly establish the convergence rate of the method by computing two 
global flow field properties, namely the drag coefficient, D, and the flow rate 
of the main vortex, G. We refer to Refs. [T5f4*6"] for precise definitions of all 
the macroscopic quantities used in the present Section. The absolute relative 
error in the long-term dimensionless values of D and G are shown in Figs |4^i 
and |4jD, respectively, versus the dimensionless spatial grid size, l/y/M, for 
two values of the Knudsen number, Kn = 0.02 (solid symbols) and Kn = 0.1 
(empty symbols), and two values of the Mach number, Ma — > (squares) and 
Ma = 0.4 (circles). It is worth noticing that the scheme which has been used 
for numerically solving the Boltzmann equation in the linearized regime, that 
is for Ma — > 0, is provided by Eq. ^ in which the second term on the right 
hand side is disregarded. The exact values of D and G, which are referred 
to as -D e (Ma) and G e (Ma), have been extrapolated from the linear fit of the 
results when 1/a/M — y 0. The linear behavior of the absolute relative errors 
demonstrates that the results are in the asymptotic range of convergence and 
the method is overall first order accurate in the physical space [47] . Table [I] 
shows the numerical predictions of the volume flow rate of the main vortex 
G, the drag coefficient, D, the mean temperature, O, and the xi-component 
of the mean heat flux, $i, along the moving wall. The error bound for G and 
D is also reported for Ma — > and Ma = 0.4. The largest percentage errors 
in D and G are about 2.6% and 0.66% and they are attained at Kn = 0.02 
in the linearized regime. Results for intermediate Mach numbers are expected 
to retain a similar level of accuracy. Figures [5] and [6] show the profiles of the 
dimensionless horizontal component of the velocity, V\/V W1 along the vertical 
line crossing the center of the cavity and the dimensionless vertical component 
of the velocity, VijV w , along the horizontal line crossing the center of the cav- 
ity, respectively, for Kn = 0.02 and Mach numbers in the range [0,0.4]. The 
possibilities of the proposed method of solution are further illustrated by the 
computations reported in Figs. [7j and [8j Fig. [7j shows snapshots of the heat 
flux lines superimposed to the temperature field (T — T )/T for two values of 
the Mach number, Ma = 0.01 (left panel) and Ma = 0.4 (right panel) whereas 
in Fig. [8] the time evolution of the drag coefficient, D, is reported during the 
simulation for Kn = 0.02 and Mach numbers in the range [0, 0.4]. These results 
would be difficult to obtain with a particle-based method since computation- 
ally expensive time and/or ensemble averaging are needed to provide such 
smooth macroscopic fields. 
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Fig. 4. Absolute relative error on (a) mean flow rate, G, and (b) drag coefficient, 
D, for two values of the Knudsen number, Kn = 0.02 (solid symbols) and Kn = 0.1 
(empty symbols), and two values of the Mach number, Ma — > (squares) and 
Ma = 0.4 (circles), versus the size of the grid in the physical space, 1/M 1 / 2 . Dashed 
lines are the least-mean square fit of the results. 



We conclude with a few remarks on the parallel implementation of the pro- 
posed method of solution. Profiling of the serial code has highlighted that 
the collision step has a computational cost which is four order of magnitude 
greater than the computational cost of the streaming step and it takes about 
99% of the total simulation time. Therefore, only the collision step has been 
parallelized leaving the transport step performed by all MPI processes. Matrix- 
vector operations involved in the collision step are independent along i, j and 
k indices. As parallelization strategy we decided to implement an hybrid so- 
lution decomposing the simulation along the % index with OpenMP and the k 
index with MPI. Simulations have been performed on a cluster which is com- 
prised by 128 nodes, each node having 12 cores Intel Xeon X5660 and 24 GB 
of RAM. The nodes are interconnected by an Infiniband QDR/DDR Voltaire 
network. Benchmarks showed that the parallelization strategy which has been 
adopted yields an efficiency greater than 70% up to 128 cores. The results 
reported in this Section were obtained using 64 cores, with a memory require- 
ment for each simulation of about 100 MB of RAM and a computational cost 
of the order of 0(M x N 3 ) floating point operations per time iteration, being 
M = 400000 and N = 256 the total numbers of cells in the physical and 
velocity space, respectively. The time required to execute one time iteration 
was about 13 s. 
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Fig. 5. Profiles of the dimensionless horizontal mean velocity along the vertical line 
crossing the center of the cavity, V\/V w for Kn = 0.02 and different Mach numbers. 
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Fig. 8. Drag coefficient versus dimensionless time for different Mach numbers, Ma. 
Kn = 0.02. 

4 Conclusion 



We developed a deterministic method of solution for the Boltzmann equation 
based on a pseudo-spectral Galerkin discretization of the velocity space. The 
novel aspect of the proposed method is the choice of trial and test functions, 
namely the collocation basis functions related to weights and roots of a Gauss- 
Hermite quadrature formula defined on the basis of half- and/or full-range 
Hermite polynomials, Eqs. (28). The use of half-range Hermite polynomials 
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is an important feature of the method since it permits a straightforward and 
exact formulation of boundary conditions. Due to this discretization of the 
velocity space, the Boltzmann equation simplifies to a system of hyperbolic 
conservation laws with non-linear source terms in the physical space, Eqs. (J9|. 
A fully discrete numerical scheme is derived by a first-order time splitting of 
the evolution operator which couples a first-order finite- volume scheme for the 
transport step with a first-order explicit Euler scheme for the relaxation step. 
However, more sophisticated methods can be used for coupling and solving 
both transport and relaxation steps. Numerical results have been presented 
for the space homogeneous relaxation to equilibrium of Maxwell molecules and 
the two-dimensional cavity flow of a gas composed by hard-sphere molecules. 
Overall, the numerical results are affected by an error of less than a few per- 
cent. 

The proposed method of solution has a number of attracting features such 
as explicit identification and exact treatment of boundary conditions, spec- 
tral accuracy in the velocity space and preservation of the collision invariants. 
While the method can in principle handle a wide range of Knudsen and Mach 
numbers, it turns out to be feasible only in dealing with slightly rarefied low 
Mach number gas flows. The main obstacle is the large number of operations 
required to evaluate the collision integral. However, a preliminary study in- 
dicates that the numerical code possesses very good scalability on a memory 
distributed cluster of GPUs and supercomputing architectures like IBM Blue 
Gene/Q. A throughly discussion of these parallel implementations will be the 
subject of a future publication. 
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Appendix: Half-range Gauss-Hermite quadrature 

We introduce polynomials in three variables defined as the Cartesian product 
of polynomials in one variable and we organize them according to the lexico- 



graphical order, i.e., Vi(v) = V^\vi)V-^(v 2 )V^\v 3 ) where i = i 1 + N t i 2 + 
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NiN 2 t3, i a = 0, ■ ■ ■ , N a — 1 with a = 1, 2, 3 and iV a the number of polynomials 
in the v a velocity component. By applying the Gram-Schmidt procedure to 
the dimensionless monomial powers 1, v a / (RTo) 1 ^ 2 , v 2 J (RT ), . . ., polynomials 
Vi^ can be constructed that are orthonormal in the range [a a , b a ] with respect 
to the weighted inner product defined as 



M(v a )P£\v a )P£>{v a ) dv a = 5 iada (24) 

denotes the Kronecker delta function which is 1 for i a = j a and 
otherwise. In Eq. (24), M(v a ) is the Maxwellian weight function with unit 
mass which reads 



>(«)/ 



where 5,- 



let ija 



M(v a ) = , 1 exp f %- ] (25) 

V aJ V^Mh 1 \ 2RT J v ; 

As a matter of fact, polynomials v\ can be regarded as defined over the 
whole real axis by setting their values to zero outside the interval [a a , b a ]. 
The polynomials afore introduced permit to define the following quadrature 
formula 



JV1-1JV2-1JV3-1 

M(v)f(v)dv^ £ E <W/(^U 2) .^) ( 26 ) 

JR ii=0 i 2 =0 i 3 =0 

where are the roots of the polynomial and are quadrature weights 
which are determined by solving the linear systems of equations 



N a -l 

E ^(^H? = V^, <a = 0, . . . , N a - 1 (27) 



where a = 1,2,3. The quadrature formula given by Eq. (26) can be shown 
to provide the correct result for the integral of functions that are products of 
polynomials in each velocity component v a up to degree 2N a — 1, as long as 
their support is contained in [ai,&i] x [a 2 ,&2] x [03,63]. 

In the present work, three polynomial sets are used that satisfy Eq. (24) in a 
different interval [a a , b a ]. The support of full-range Hermite polynomials, H( , 
is (a a ,b a ) = (—00, +00) whereas the support of negative, H™ , and positive, 
Hf a , half-range Hermite polynomials are (a a ,b a ) = (— 00, 0) and (a a ,b a ) = 
(0,+oo), respectively. Zeros and weig ths of H{ ,H? and Hf are referred to 
and w{ a , tyf , wf a , respectively. It is not difficult to prove that the 
zeros of full-range Hermite polynomials are symmetric about the zero velocity, 
i.e., z( a = , whereas the zeros of positive and negative half-range 

Hermite polynomials have equal magnitude but different sign, i.e., z™ = —zf a . 
The collocation basis functions introduced in Section [2] are based on roots and 
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weights of the Gauss- Hermite quadrature formula defined by Eq. (26) for the 
polynomial set v\ a ^ . More precisely, the collocation basis functions read 



B i (v 1 ,v 2 ,v 3 ) 



where z\ and w\"' are the nodes and weights of the quadrature formula 
given by Eq. (26) based on the polynomials V^. These are defined differently 



JVi-l jv 2 -i JV3-1 

E E E 

ii=o j2=o j3=o 



,(«) 



^Vi)^? (^2)^(^3) (28) 



(3), 



depending whether or not the deviational part of the distribution function is 
expected to have a discontinuity in the v a velocity component. More precisely, 
if h is continuous in the velocity space then V- a ^ = H- and z- = z{ , 

w i = w { ■ Instead, if h presents a discontinuity in the v a velocity component 
then 



V 



(a) 



m 

La 

ttP 

n i a -N a /2 



Likewise 



and 



» 



Z i a ~N a /2 l c 



= 0,...,iV a /2-l 
= NJ2, . . . , N a - 1 

0, . . . , NJ2 - 1 
N a /2,...,N a -l 



(29) 



(30) 



W: 



0,...,iV Q /2-l 
N a /2,...,N a -l 

Equation (|28h can be rewritten with a more compact notation as follows 



W L-N a /2 ^ 



(31) 



JV-l 



Bi(v) = E <MPi(*i)'PM 

j=0 



(32) 



By using the quadrature formula given by Eq. (26), it is possible to prove 



that the collocation basis functions defined by Eq. (28) are orthonormal with 
respect to the Maxwellian weight factor, 



M(v)Bi(v)Bj(v)dv = Sij (33) 
and that the z-th basis function vanishes on all nodes but node % 



B t ( Zj ) = 6 tj /^F t (34) 

For illustrative purposes, in Figs[9j we report the collocation basis functions in 
one-variable based on the quadrature nodes of the full-range Hermite polyno- 
mials (left panel) and positive half-range Hermite polynomials (right panel). 



19 




Fig. 9. Collocation basis function based on (a) full-range Hermite polynomials, 
N = 3 and (b) positive half-range Hermite polynomials, N = 6. 
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